Abstract

We derive lower bounds for tradeoffs between the communication $C$ and space $S$ for communicating circuits. The first such bound applies to quantum circuits. If for any problem $f : X \times Y \to Z$ the multicolor discrepancy of the communication matrix of $f$ is $1/2^d$, then any bounded error quantum protocol with space $S$, in which Alice receives some $l$ inputs, Bob $r$ inputs, and they compute $f(x_i, y_j)$ for the $l \cdot r$ pairs of inputs $(x_i, y_j)$ needs communication $C = \Omega(lrd \log|Z| / S)$. In particular, $n \times n$-matrix multiplication over a finite field $F$ requires $C = \Omega(n^3 \log^2|F| / S)$, matrix-vector multiplication $C = \Omega(n^2 \log^2|F| / S)$. We then turn to randomized bounded error protocols, and derive the bounds $C = \Omega(n^3/S^2)$ for Boolean matrix multiplication and $C = \Omega(n^2/S^2)$ for Boolean matrix-vector multiplication, utilizing a new direct product result for the one-sided rectangle lower bound on randomized communication complexity. These results imply a separation between quantum and randomized protocols when compared to quantum bounds in [KSW04] and partially answer a question by Beame et al. [BTY94].

1 Introduction

1.1 Quantum Tradeoffs

Computational tradeoff results show how spending of one resource must be increased when availability of another resource is limited in solving computational problems. Results of this type were first established by Cobham [Cob66], and have been found to describe nicely the joint behavior of computational resources in many cases. Among the most important such results are time-space tradeoffs, due to the prominence of these two resources. It can be shown that e.g. (classically) sorting $n$ numbers requires that the product of time and space is $\Omega(n^2)$ [Bea91], and time $O(n^2/S)$ can also be achieved in a reasonable model of computation for all $\log n \le S \le n/\log n$ [PR98].

The importance of such results lies in the fact that they capture the joint behavior of important 
resources for many interesting problems as well as in the possibility to prove superlinear lower 
bounds for tradeoffs, while superlinear lower bounds for single computational resources can usually 
not be obtained with current techniques. 

* Supported by DFG grant KL 1470/1. Work partially done at Department of Computer Science, University of 
Calgary, supported by Canada's NSERC and MITACS. 
Quantum computing is an active research area offering interesting possibilities to obtain improved solutions to information processing tasks by employing computing devices based on quantum physics, see e.g. [NC00] for a nice introduction to the field. Since the number of known quantum algorithms is rather small, it is interesting to see which problems might be candidates for quantum speedups. Naturally we may also consider tradeoffs between resources in the quantum case. It is known that e.g. quantum time-space tradeoffs for sorting are quite different from the classical tradeoffs, namely $T^2 S = \tilde{\Theta}(n^3)$ [KSW04] (for an earlier result see [A04]). This shorthand notation is meant as follows: the lower bound says that for all $S$ any algorithm with space $S$ needs time $\tilde{\Omega}(n^{3/2}/\sqrt{S})$, while the upper bound says that (in this case for all $\log^3 n \le S \le n$) there is a space $S$ algorithm with time $\tilde{O}(n^{3/2}/\sqrt{S})$.

Communication-space tradeoffs can be viewed as a generalization of time-space tradeoffs. Study of these has been initiated in a restricted model by Lam et al. [LTT92], and several tight results in a general model have been given by Beame et al. [BTY94]. In the model they consider, two players only restricted by limited workspace communicate to compute a function together. Note that whereas communication-space tradeoffs always imply time-space tradeoffs, the converse is not true: e.g. if players Alice and Bob receive a list of $n$ numbers with $O(\log n)$ bits each, then computing the sorted list of these can be done deterministically with communication $O(n \log n)$ and space $O(\log n)$.

Most of the results in this paper are related to the complexity of matrix multiplication. The foremost question of this kind is of course whether quantum algorithms can break the current barrier of $O(n^{2.376})$ for the time-complexity of matrix multiplication [CW90] (it has recently been shown that checking matrix multiplication is actually easier in the quantum case than in the classical case, and can be done in time $O(n^{5/3})$ [BS04]). In this paper we investigate the communication-space tradeoff complexity of matrix multiplication and matrix-vector multiplication. Communication-space tradeoffs in the quantum setting have recently been established [KSW04] for Boolean matrix-vector product and matrix multiplication. In the former problem there are an $n \times n$ matrix $A$ and a vector $b$ of dimension $n$ (given to Alice resp. to Bob), and the goal is to compute the vector $c = Ab$, where $c_i = \bigvee_{j=1}^{n} (A[i,j] \wedge b_j)$. In the latter problem of Boolean matrix multiplication two matrices have to be multiplied with the same type of Boolean product. The paper [KSW04] gives tight lower and upper bounds for these problems, namely $C^2 S = \tilde{\Theta}(n^5)$ for Boolean matrix multiplication and $C^2 S = \tilde{\Theta}(n^3)$ for Boolean matrix-vector multiplication.

Here we first study these problems in the case when the matrix product is not defined over the Boolean operations $\wedge$ and $\vee$ (which form a semiring with $\{0, 1\}$), but over finite fields, and again for quantum circuits. Later we go back to the Boolean product and study the classical complexities of these problems, in order to get a quantum/classical separation for the Boolean case. All these results are collected in the following table.
[Table: summary of the tradeoff results; contents not recovered.]

Note that in the above table all upper bounds hold for $\log n \le S \le n$, and that the results from [BTY94] are actually shown in a slightly different model (branching programs that communicate field elements at unit cost) and hence stated with a factor of $\log|F|$ less there.

1.2 Direct Product Results 

As in [KSW04] we use direct product type results to obtain quantum communication-space tradeoff lower bounds for functions with many outputs. In this approach (as in previous proofs concerning such tradeoffs) a space bounded circuit computing a function is decomposed into slices containing a certain amount of communication. Such a circuit slice starts with a (possibly complicated) initial state computed by the gates in previous slices, but this state can be replaced by the totally mixed state at the cost of reducing the success probability by a factor of $1/2^S$, where $S$ is the space bound. If we manage to show that a circuit with the given resources (but with no initial information) can compute $k$ output bits of the function only with success probability exponentially small in $k$, then $k = O(S)$, and we can prove a tradeoff result by concluding that the number of circuit slices times $O(S)$ must be larger than the number of output bits.

A direct product result says that when solving k instances of a problem simultaneously the 
success probability will go down exponentially in k. There are two different types of direct product 
results. In a strong direct product result we try to solve k instances with k times the resources 
that allow us to solve the problem on one instance with probability 2/3. In a weak direct product 
theorem we have only the same amount of resources as for one instance. 

Our approach is to show direct product type results for lower bound techniques that work for quantum resp. randomized communication complexity of functions $f$. We focus on lower bound methods defined in terms of the properties of rectangles in the communication matrix of $f$. There are several techniques available now for proving lower bounds on the quantum communication complexity (see [Ra03, Kla01]). The earliest such technique was the discrepancy bound, first applied to quantum communication by Kremer [Kre95]. This bound is also related to the majority nondeterministic communication complexity [Kla01].

Definition 1 Let $\nu$ be a distribution on $X \times Y$ and $f$ be any function $f : X \times Y \to \{0, 1\}$. Then let $disc_\nu(f) = \max_R |\nu(R \cap f^{-1}(0)) - \nu(R \cap f^{-1}(1))|$, where $R$ runs over all rectangles in the communication matrix of $f$ (see Section 2.2).

In the rest of the paper $\mu$ will always denote the uniform distribution on some domain. $disc(f)$ will be a shorthand for $disc_\mu(f)$. We will also refer to the term maximized above as the discrepancy of a particular rectangle. Since we are dealing with multiple output problems, a notion of multicolor discrepancy that we are going to define later will also be useful. $-\log(disc(f))$ gives a lower bound on the quantum communication complexity [Kre95].
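
For very small functions the quantity in Definition 1 can be evaluated directly by enumerating all rectangles. The following is only an illustrative sketch under the uniform distribution; the function names are hypothetical and the enumeration is exponential in the matrix dimensions, so it is usable for toy matrices only.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(n):
    """Yield every nonempty subset of range(n) as a tuple."""
    for size in range(1, n + 1):
        yield from combinations(range(n), size)


def disc_uniform(matrix):
    """Discrepancy of a 0/1 matrix under the uniform distribution.

    Enumerates all rectangles A x B and returns the maximum of
    |mu(R and f=0) - mu(R and f=1)|.
    """
    rows, cols = len(matrix), len(matrix[0])
    total = rows * cols
    best = Fraction(0)
    for a in nonempty_subsets(rows):
        for b in nonempty_subsets(cols):
            # +1 for every 0-entry and -1 for every 1-entry in the rectangle
            signed = sum(1 - 2 * matrix[x][y] for x in a for y in b)
            best = max(best, Fraction(abs(signed), total))
    return best


def inner_product_mod2(n):
    """Communication matrix of the inner product mod 2 on n-bit inputs."""
    size = 1 << n
    return [[bin(x & y).count("1") % 2 for y in range(size)]
            for x in range(size)]


if __name__ == "__main__":
    # For the 2 x 2 XOR matrix single cells are the best rectangles.
    print(disc_uniform([[0, 1], [1, 0]]))  # 1/4
    print(disc_uniform(inner_product_mod2(2)))
```

A constant matrix has discrepancy 1 (take the whole matrix as the rectangle), while balanced functions such as the inner product mod 2 have small discrepancy.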

As Shaltiel [Sha01] has pointed out, in many cases strong direct product theorems do not hold. He however gives a strong direct product theorem for the discrepancy bound, or rather a XOR-lemma: he shows that

$$disc\Big(\bigoplus_{i=1,\ldots,k} f(x_i)\Big) \le disc(f(x))^{\Omega(k)}.$$

Previously Parnafes et al. [PRW97] showed a general direct product theorem for classical communication complexity, but in their result the success probability is only shown to go down exponentially in $k/c$, where $c$ is the communication complexity of the problem on one instance, so this result cannot be used for deriving good tradeoff bounds. Klauck et al. [KSW04] have recently given a strong direct product theorem for computing $k$ instances of the Disjointness problem in quantum communication complexity.

Instead of the usual direct product formulation ($k$ independent instances of a problem have to be solved) we first focus on the following setup (a generalized form of matrix multiplication): Alice receives $l$ inputs, Bob receives $r$ inputs, and they want to compute $f(x_i, y_j)$ for all $lr$ pairs of inputs for some function $f$. We denote this problem by $f_{l,r}$. We will show that when the communication in a quantum protocol is smaller than the discrepancy bound (for one instance) then the success probability of computing some $k$ of the outputs of $f_{l,r}$ goes down exponentially in $k$ (for all $k$ smaller than the discrepancy bound), and refer to such a result as a bipartite product result. This differs from Shaltiel's direct product result for discrepancy [Sha01] in three ways: first, it only holds when the communication is smaller than the discrepancy bound for one instance (like a weak direct product result); secondly, it deals with correlated input instances (in the described bipartite way); thirdly, it is not about the discrepancy of the XOR of the outputs for $k$ instances, but rather about the multicolor discrepancy.

1.3 Our Results 

The first lower bound result of this paper is the following: 

Theorem 1 Let $f : X \times Y \to \{0,1\}$ with $disc(f) \le 1/2^d$. Then any quantum protocol using space $S$ that computes $f_{l,r}$ needs communication $\Omega(dlr/S)$.

A completely analogous statement can be made for functions $f : X \times Y \to Z$ for some set $Z$ of size larger than two and multicolor discrepancy, where the lower bound is larger by a factor of $\log|Z|$.

The inner product function over a field $F$ is $IP^F(x,y) = \sum_{i=1}^{n} x_i \cdot y_i$ with operations over $F$. $IP^{GF(2)}$ has been considered frequently in communication complexity theory. It is known that its quantum communication complexity is $\Theta(n)$ (the lower bound can be proved using discrepancy [Kre95]). Note that $IP^F_{n,n}$ corresponds to the multiplication of two $n \times n$ matrices over $F$, while $IP^F_{n,1}$ is the matrix-vector product. It is well known that $disc(IP^{GF(2)}) \le 2^{-n/2}$ (see [KN97]). A generalization of this result given by Mansour et al. [MNT93] implies similar bounds on the multicolor discrepancy of inner products over larger fields. Together with a trivial deterministic algorithm in the model of communicating circuits we get the following corollary.
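
The correspondence between the bipartite problem for the inner product and matrix multiplication can be checked on a small example. The sketch below assumes a prime field $GF(p)$ represented by integers modulo $p$; the helper names are illustrative and not taken from the paper.

```python
def inner_product(x, y, p):
    """IP^F(x, y) = sum_i x_i * y_i over the prime field GF(p)."""
    return sum(a * b for a, b in zip(x, y)) % p


def bipartite_problem(f, xs, ys):
    """f_{l,r}: evaluate f on all l * r pairs (x_i, y_j)."""
    return [[f(x, y) for y in ys] for x in xs]


def mat_mul_mod(a, b, p):
    """Schoolbook product of a (l x n) and b (n x r) over GF(p)."""
    n = len(b)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) % p
             for j in range(len(b[0]))]
            for i in range(len(a))]


if __name__ == "__main__":
    p = 5
    a = [[1, 2, 3], [4, 0, 1], [2, 2, 2]]   # Alice's inputs: rows of A
    b = [[0, 1, 4], [3, 3, 1], [2, 0, 1]]   # Bob's matrix B
    cols = [list(c) for c in zip(*b)]        # Bob's inputs: columns of B
    out = bipartite_problem(lambda x, y: inner_product(x, y, p), a, cols)
    print(out == mat_mul_mod(a, b, p))
```

Giving Bob a single column instead of $n$ columns yields the matrix-vector product in the same way.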

Corollary 1 Assume $\log n \le S \le n \log|F|$.

$IP^F_{n,n}$ can be computed by a deterministic protocol with space $S$ and communication $O(n^3 \log^2(|F|)/S)$, and any bounded error quantum protocol with space $S$ needs communication $\Omega(n^3 \log^2(|F|)/S)$ for this problem.

$IP^F_{n,1}$ can be computed by a deterministic protocol with space $S$ and communication $O(n^2 \log^2(|F|)/S)$, and any bounded error quantum protocol with space $S$ needs communication $\Omega(n^2 \log^2(|F|)/S)$ for this problem.

Using a lemma from [MNT93] (also employed in [BTY94]) we are also able to give a lower bound for pairwise universal hash functions.

Definition 2 A pairwise universal family $Y$ of hash functions from a set $X$ to a set $Z$ has the following properties when $h \in Y$ is chosen uniformly at random:

1. For any $x \in X$: $h(x)$ is uniformly distributed in $Z$.

2. For any $x, x' \in X$ with $x \ne x'$, and any $z, z' \in Z$, the events $h(x) = z$ and $h(x') = z'$ are independent.

In the problem of evaluating a hash function by a protocol Alice gets $x \in X$, Bob gets a function $h \in Y$, and they compute $h(x)$.

Corollary 2 Any bounded error quantum protocol that evaluates a pairwise universal family of hash functions using space $S$ needs communication at least $\Omega(\min\{\log(|X|) \cdot \log(|Z|)/S,\ \log^2(|Z|)/S\})$.

Beame et al. [BTY94] have established the first term in the above expression as a lower bound for randomized communicating circuits. Hence our quantum lower bound is weaker for hash functions that map to a small domain.

There are many examples of pairwise universal hash functions, see [MNT93]. Let us just mention the function $f : GF(r) \times GF(r)^2 \to GF(r)$ defined by $f(x, (a, b)) = a \cdot x + b$. If $n = \lceil \log r \rceil$ then this function has a quantum communication tradeoff $CS = \Omega(n^2)$. Also there are universal hash functions that can be reduced to matrix multiplication and matrix-vector multiplication over finite fields, and we could have deduced the result about matrix-vector multiplication in Corollary 1 from the above result. The result about matrix multiplication would not follow, since the standard reduction from convolution (see [MNT93]; matrix multiplication itself is not a hash function) has the problem that for convolution the $\log^2|Z|$ term is much smaller than the $\log|X| \cdot \log|Z|$ term, and we would not get a good lower bound. Also not every function $f_{l,r}$, where $f$ has small discrepancy, is a universal hash function.
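
Both properties of Definition 2 can be verified exhaustively for the family $h_{a,b}(x) = a \cdot x + b$ when the field is small. The sketch below restricts itself to arithmetic modulo a number $p$ (a field only for prime $p$); this restriction and the function names are assumptions made for illustration.

```python
from itertools import product


def affine_hash(a, b, x, p):
    """h_{a,b}(x) = a * x + b computed modulo p."""
    return (a * x + b) % p


def is_pairwise_universal(p):
    """Check both properties of Definition 2 by brute force."""
    family = list(product(range(p), repeat=2))
    # Property 1: for each fixed x, h(x) is uniform on Z.
    for x in range(p):
        counts = [0] * p
        for a, b in family:
            counts[affine_hash(a, b, x, p)] += 1
        if any(c * p != len(family) for c in counts):
            return False
    # Property 2: for x != x', the pair (h(x), h(x')) is uniform on Z x Z;
    # together with property 1 this is exactly independence.
    for x, x2 in product(range(p), repeat=2):
        if x == x2:
            continue
        counts = {}
        for a, b in family:
            key = (affine_hash(a, b, x, p), affine_hash(a, b, x2, p))
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != p * p:
            return False
        if any(c * p * p != len(family) for c in counts.values()):
            return False
    return True


if __name__ == "__main__":
    print(is_pairwise_universal(5))
    print(is_pairwise_universal(4))
```

The check fails modulo 4, where the integers do not form a field, which illustrates why the construction is stated over $GF(r)$.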

We then turn to classical communication-space tradeoffs for Boolean matrix and Boolean 
matrix-vector multiplication. We show a weak direct product theorem for the one-sided rect- 
angle bound on randomized communication complexity, which allows us to deduce a weak direct 
product theorem for the classical complexity of the Disjointness problem. Using this we can show 
a communication-space tradeoff lower bound for Boolean matrix multiplication, a problem posed 
by Beame et al. [BTY94].

In the Disjointness problem Alice has an $n$-bit input $x$ and Bob has an $n$-bit input $y$. These $x$ and $y$ represent sets, and $DISJ(x, y) = 1$ iff those sets are disjoint. Note that $DISJ$ is $NOR(x \wedge y)$, where $x \wedge y$ is the $n$-bit string obtained by bitwise AND-ing $x$ and $y$. The communication complexity of $DISJ$ has been well studied: it takes $\Theta(n)$ communication in the classical (randomized) world [KS92, Ra92] and $\Theta(\sqrt{n})$ in the quantum world [BCW98, HW02, AA03, Ra03]. A strong direct product theorem for the quantum complexity of Disjointness has been established in [KSW04], but the randomized case was left open. $DISJ_{n,n}$ is (the bitwise negation of) the Boolean matrix product.
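
The last remark can be illustrated on a small instance: evaluating $DISJ$ on every pair of a row of $A$ and a column of $B$ gives the entrywise negation of the Boolean product $AB$. The helper names below are illustrative only.

```python
def disj(x, y):
    """DISJ(x, y) = 1 iff the sets with characteristic vectors x, y are disjoint."""
    return int(not any(a and b for a, b in zip(x, y)))


def boolean_mat_mul(a, b):
    """Boolean product: entry (i, j) is OR_t (a[i][t] AND b[t][j])."""
    n = len(b)
    return [[int(any(a[i][t] and b[t][j] for t in range(n)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def disj_all_pairs(rows, cols):
    """DISJ_{l,r}: DISJ on every pair (row_i, col_j)."""
    return [[disj(x, y) for y in cols] for x in rows]


if __name__ == "__main__":
    a = [[1, 0, 0], [0, 1, 1], [0, 0, 0]]
    b = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
    cols = [list(c) for c in zip(*b)]
    negated = [[1 - v for v in row] for row in boolean_mat_mul(a, b)]
    print(disj_all_pairs(a, cols) == negated)
```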

Theorem 2 There are constants $\epsilon, \gamma > 0$ such that when Alice and Bob have $k \le \epsilon n$ instances of the Disjointness problem on $n$ bits each, and they perform a classical protocol with communication $\epsilon n$, then the success probability of computing all these instances simultaneously correctly is at most $2^{-\gamma k}$.

An application of this gives a classical communication-space tradeoff.

Theorem 3 For the problem $DISJ_{n,n}$ (Boolean matrix multiplication) every randomized space $S$ protocol with bounded error needs communication $\Omega(n^3/S^2)$.

For the problem $DISJ_{n,1}$ (Boolean matrix-vector multiplication) every randomized space $S$ protocol with bounded error needs communication $\Omega(n^2/S^2)$.

The proof is in the appendix. Obvious upper bounds are $O(n^3/S)$ resp. $O(n^2/S)$ for all $\log n \le S \le n$. No lower bound was known prior to the recent quantum bounds in [KSW04]. Note that the known quantum bounds for these problems are tight as mentioned above. For small $S$ we still get near-optimal separation results, e.g. for polylogarithmic space quantum protocols for Boolean matrix multiplication need communication $\tilde{\Omega}(n^{2.5})$, classical protocols $\tilde{\Omega}(n^3)$. The reason we are able to analyze the quantum situation more satisfactorily is the connection between quantum protocols and polynomials exhibited by Razborov [Ra03], allowing algebraic instead of combinatorial arguments.

2 Definitions and Preliminaries 
2.1 Communicating Quantum Circuits 

In the model of quantum communication complexity, two players Alice and Bob compute a function $f$ on distributed inputs $x$ and $y$. The complexity measure of interest in this setting is the amount of communication. The players follow some predefined protocol that consists of local unitary operations, and the exchange of qubits. The communication cost of a protocol is the maximal number of qubits exchanged for any input. In the standard model of communication complexity Alice and Bob are computationally unbounded entities, but we are also interested in what happens if they have bounded memory, i.e., they work with a bounded number of qubits. To this end we model Alice and Bob as communicating quantum circuits, following Yao [Yao93].

A pair of communicating quantum circuits is actually a single quantum circuit partitioned into 
two parts. The allowed operations are local unitary operations and access to the inputs that are 
given by oracles. Alice's part of the circuit may use oracle gates to read single bits from her input, 
and Bob's part of the circuit may do so for his input. The communication $C$ between the two parties is simply the number of wires carrying qubits that cross between the two parts of the circuit. A pair of communicating quantum circuits uses space $S$, if the whole circuit works on $S$ qubits.

In the problems we consider, the number of outputs is much larger than the memory of the 
players. Therefore we use the following output convention. The player who computes the value 
of an output sends this value to the other player at a predetermined point in the protocol, who is 
then allowed to forget the output. In order to make the model as general as possible, we allow the 
players to do local measurements, and to throw qubits away as well as pick up some fresh qubits. 
The space requirement only demands that at any given time no more than S qubits are in use in 
the whole circuit. 

For more quantum background we refer to [NC00].

2.2 The Discrepancy Lower Bound and Other Rectangle Bounds 

Definition 3 The communication matrix $M_f$ of a function $f : X \times Y \to Z$ with rows and columns corresponding to $X, Y$ is defined by $M_f(x,y) = f(x,y)$.

A rectangle is a product set in $X \times Y$. Rectangles are usually labelled, an $i$-rectangle being labelled with $i \in Z$. $\ell(R)$ gives the label of $R$.

We will make use of the following simple observation. 

Proposition 1 Let $R \subseteq X^l \times Y^r$ be a rectangle. Then the set

$$R'[u,v] = \{ (x_i, y_j) \mid (u_1, \ldots, u_{i-1}, x_i, u_{i+1}, \ldots, u_l,\; v_1, \ldots, v_{j-1}, y_j, v_{j+1}, \ldots, v_r) \in R \}$$

is a rectangle in $X \times Y$ for all fixed values $u_a \in X$ and $v_b \in Y$, $1 \le a \le l$, $1 \le b \le r$.

The discrepancy bound has been defined above. The application of the discrepancy bound to communication complexity is as follows (see [Kre95]):

Fact 2 A quantum protocol which computes a function $f : X \times Y \to \{0, 1\}$ correctly with probability $1/2 + \epsilon$ over a distribution $\nu$ on the inputs (and over its measurements) needs at least $\Omega(\log(\epsilon / disc_\nu(f)))$ communication.

We will use the following generalization of discrepancy to matrices whose entries have more than two different values.

Definition 4 For a matrix $M$ with $M(x, y) \in Z$ for some finite set $Z$ we define its multicolor discrepancy as

$$mdisc(M) = \max_R \max_{z \in Z} \left| \mu(R \cap M^{-1}(z)) - \mu(R)/|Z| \right|,$$

where the maximization is over all rectangles $R$ in $M$.

The above definition corresponds to the notion of strong multicolor discrepancy used previously in communication complexity theory by Babai et al. [BHK01]. A matrix with high multicolor discrepancy has rectangles whose measure of one color is very different from the average $\mu(R)/|Z|$. Note that we have defined this only for the uniform distribution $\mu$ here, and that only functions for which all outputs have almost equal probabilities are good candidates for small multicolor discrepancy (e.g. the inner product over finite fields).

We next define the one-sided rectangle bound on randomized communication complexity, see Example 3.22 in [KN97] and also [Kla03].
Definition 5 Let $\nu$ be a distribution on $X \times Y$. Then $\nu$ is (strictly) balanced for $f : X \times Y \to \{0, 1\}$, if $\nu(f^{-1}(1)) = 1/2 = \nu(f^{-1}(0))$.

Definition 6 Let $err(R, \nu, i) = \nu(f^{-1}(1 - i) \mid R)$ denote the error of an $i$-rectangle $R$. Then let $size(\nu, \epsilon, f, i) = \max\{\nu(R) : err(R, \nu, i) \le \epsilon\}$, where $R$ runs over all rectangles in $M_f$.

Define $bound^{(1)}(f) = \max_\nu \log(1 / size(\nu, \epsilon, f, 1))$, where $\nu$ runs over all balanced distributions on $X \times Y$. Finally,

$$bound(f) = \max\{bound^{(1)}(f),\ bound^{(1)}(\neg f)\}.$$

The application to classical communication is as follows.

Fact 3 For any function f : X xY ^ {0,1}, its (public coin) randomized communication com- 
plexity with error 1/^ is lower bounded by bound{ f). 
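
To make the quantities of Definition 6 concrete, the following sketch computes $size(\nu, \epsilon, f, 1)$ by enumerating all rectangles of a tiny matrix. It assumes one particular balanced distribution (half of the weight spread evenly over the 1-entries, half over the 0-entries); the definition itself maximizes over all balanced distributions, which this sketch does not attempt. All names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(n):
    """Yield every nonempty subset of range(n) as a tuple."""
    for size in range(1, n + 1):
        yield from combinations(range(n), size)


def disj_matrix(n):
    """Communication matrix of DISJ on n-bit inputs (inputs as bitmasks)."""
    size = 1 << n
    return [[int((x & y) == 0) for y in range(size)] for x in range(size)]


def balanced_distribution(matrix):
    """One balanced choice: weight 1/2 on the 1-entries, 1/2 on the 0-entries."""
    ones = sum(v for row in matrix for v in row)
    zeros = len(matrix) * len(matrix[0]) - ones
    return [[Fraction(1, 2 * ones) if v else Fraction(1, 2 * zeros)
             for v in row] for row in matrix]


def size_one_sided(matrix, nu, eps):
    """Largest nu-measure of a rectangle whose error as a 1-rectangle,
    i.e. its conditional weight of 0-entries, is at most eps."""
    rows, cols = len(matrix), len(matrix[0])
    best = Fraction(0)
    for a in nonempty_subsets(rows):
        for b in nonempty_subsets(cols):
            weight = sum(nu[x][y] for x in a for y in b)
            bad = sum(nu[x][y] for x in a for y in b if matrix[x][y] == 0)
            if weight > 0 and bad <= eps * weight:
                best = max(best, weight)
    return best


if __name__ == "__main__":
    m = disj_matrix(2)
    nu = balanced_distribution(m)
    print(size_one_sided(m, nu, Fraction(0)))     # 2/9
    print(size_one_sided(m, nu, Fraction(1, 4)))
```

With $\epsilon = 0$ only 1-monochromatic rectangles qualify; for $DISJ$ on 2 bits the largest of these contains 4 of the 9 one-entries, giving measure $4/18 = 2/9$.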

3 Proving Quantum Communication- Space Tradeoffs 

Suppose we are given a communicating quantum circuit that computes $f_{l,r}$, i.e., the Alice circuit gets $l$ inputs from $X$, the Bob circuit gets $r$ inputs from $Y$, and they compute all outputs $f(x_i, y_j)$. Furthermore we assume that the output for pair $(i,j)$ is produced at a fixed gate in the circuit.

Our approach to prove the lower bound is by slicing the circuit. Let $mdisc(f) = 1/2^d$. Then we partition the circuit in the following way. The first slice starts at the beginning, and ends when $d/100$ qubits have been communicated, i.e., after $d/100$ qubit wires have crossed between the Alice and Bob circuits. The next slice starts afterwards and also contains $d/100$ qubits communication and so forth. Note that there are $O(C/d)$ slices, and $lr$ outputs, so an average slice has to make about $lrd/C$ outputs. We will show that every such slice can produce only $O(S)$ output bits. This implies the desired lower bound.

So we consider what happens at a slice. A slice starts in some state on $S$ qubits that has been computed by the previous part of the computation. Then the two circuits run a protocol with $d/100$ qubits communication. We have to show that there can be at most $O(S)$ output bits. At this point the following observation will be helpful.

Proposition 4 Suppose there is an algorithm that on input $x$ first receives $S$ qubits of initial information depending arbitrarily on $x$ for free. Suppose the algorithm produces some output correctly with probability $p$.

Then the same algorithm with the initial information replaced by the totally mixed state has success probability at least $p/2^S$.
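
A classical analogue may help with the intuition behind Proposition 4: if the $S$ qubits are replaced by $S$ classical advice bits, then the totally mixed state corresponds to uniformly random advice, which coincides with the helpful advice with probability $2^{-S}$. The sketch below is only this classical analogue, with a hypothetical toy algorithm; it does not simulate the quantum statement.

```python
from fractions import Fraction
from itertools import product


def toy_algorithm(x, advice):
    """Hypothetical algorithm: succeeds with probability 9/10 when the
    advice equals the bits of x, and never succeeds otherwise."""
    return Fraction(9, 10) if tuple(advice) == tuple(x) else Fraction(0)


def success_with_random_advice(algorithm, x, s):
    """Average success probability over uniformly random s-bit advice."""
    strings = list(product((0, 1), repeat=s))
    return sum(algorithm(x, adv) for adv in strings) / len(strings)


if __name__ == "__main__":
    x = (1, 0, 1)
    p = toy_algorithm(x, x)                       # success with good advice
    q = success_with_random_advice(toy_algorithm, x, len(x))
    print(p, q, q >= p / 2 ** len(x))
```

In this toy case the bound $p/2^S$ is met with equality, since only one advice string helps.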

Suppose the circuit computes the correct output with probability 1/2. Then each circuit slice computes its outputs correctly with probability 1/2. Proposition 4 tells us that we may replace the initial state on $S$ qubits by a totally mixed state, and still compute correctly with probability $(1/2) \cdot 1/2^S$. Hence it suffices to show that any protocol with communication $d/100$ that attempts to make $\ell$ bits of output has success probability exponentially small in $\ell$. Then $\ell$ must be bounded by $O(S)$. What is left to do is provided by the following bipartite product result.

Theorem 4 Suppose a quantum protocol with communication $d/100$ makes $k \le d/(100 \log|Z|)$ outputs for function values $f(x_i, y_j)$ of $f : X \times Y \to Z$ with $mdisc(f) \le 2^{-d}$. Then the probability that these outputs are simultaneously correct is at most $(1 + o(1)) \cdot |Z|^{-k}$.

We establish this result in two steps. First we show that for each function with multiple outputs and small multicolor discrepancy all quantum protocols have small success probability.

Lemma 5 If there is a quantum protocol with communication c that computes the outputs of a 
function f : xY^ ^ Z'' so that the success probability of the protocol is 1/\Z\^ -\-a (in the worst 
case), then mdisc{f) > Q{a^/2^^'^). 

Conversely, if c < — logm(iisc(/)/10 — A; log \ Z\, then the success probability of quantum pro- 
tocols with communication c is at most (1 + o(l)) • \Z\~^. 

The next step is to derive multicolor discrepancy bounds for $f_{l,r}$ from multicolor discrepancy bounds for $f$.

Lemma 6 Let $f : X \times Y \to Z$ have $mdisc(f) \le 2^{-d}$. Let the set $O = \{(i_1, j_1), \ldots, (i_k, j_k)\}$ contain the indices of $k$ outputs for $f_{l,r}$. Denote by $f_O$ the function that computes these outputs. Then $mdisc(f_O) \le O(2^{-d/4})$, if $k \le d/5$.

These two lemmas imply Theorem 4. Their proofs are in Appendix A resp. Appendix B. Now we can conclude the following more general version of Theorem 1.

Theorem 5 Let $f : X \times Y \to Z$ with $mdisc(f) \le 1/2^d$. Then every quantum protocol using space $S$ that computes $f_{l,r}$ needs communication $\Omega(dlr \log|Z| / S)$.

Proof. Note that if $S = \Omega(d)$, we are immediately done, since communicating the outputs requires at least $lr \log|Z|$ bits. If $S \le d/200$, we can apply Theorem 4 and Proposition 4. Consider a circuit slice with communication $d/100$ and $\ell$ outputs. Apply Theorem 4 to obtain that the success probability of any protocol without initial information is at most $(1 + o(1)) \cdot |Z|^{-k}$ for $k$ being the minimum of $\ell$ and $d/(100 \log|Z|)$. With Proposition 4 we get that this must be at least $(1/2) \cdot 2^{-S}$, and hence $k \le (S + 2)/\log|Z|$. In the case $k = d/(100 \log|Z|)$ we get the contradiction $S + 2 \ge k \log|Z| = d/100$ to our assumption, otherwise we get $\ell \le (S + 2)/\log|Z|$ and hence $C/(d/100) \cdot (S + 2)/\log|Z| \ge lr$ as desired. □
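
The counting step at the end of the proof can be spelled out numerically. The sketch below only rearranges the final inequality $C/(d/100) \cdot (S+2)/\log|Z| \ge lr$ for $C$ in the regime $S \le d/200$; the function names and the sample parameters are chosen for illustration.

```python
import math


def outputs_per_slice(space, z_size):
    """Bound (S + 2) / log|Z| on the outputs one slice of d/100 qubits can make."""
    return (space + 2) / math.log2(z_size)


def communication_lower_bound(l, r, d, space, z_size):
    """Smallest C consistent with  C / (d/100) * (S + 2) / log|Z| >= l * r.

    Only meaningful in the regime S <= d/200 treated in the proof.
    """
    if space > d / 200:
        raise ValueError("sketch covers only the case S <= d/200")
    return l * r * (d / 100) / outputs_per_slice(space, z_size)


if __name__ == "__main__":
    # l = r = 100 inputs per player, d = 1000, S = 3 qubits, Boolean outputs
    print(communication_lower_bound(100, 100, 1000, 3, 2))  # 20000.0
```

Up to the constants 100 and $S + 2$ this is the bound $\Omega(dlr \log|Z| / S)$ of Theorem 5.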

We also get the following corollary in the same way. 

Corollary 3 Let $f$ be a function with $m$ output bits so that for all $k \le d$ and each subset $O$ of $k$ output bits $mdisc(f_O) \le 2^{-d}$. Then every quantum protocol with communication $C$ and space $S$ satisfies the tradeoff $CS = \Omega(dm)$.

4 Applications 

In this section we apply Theorem 5 and Corollary 3 to show some explicit communication-space tradeoffs. We have already stated our result regarding matrix and matrix-vector products over finite fields in the introduction (Corollary 1). The only missing piece is an upper bound on the multicolor discrepancy of $IP^F$ for finite fields $F$.

Lemma 7 $mdisc(IP^F) \le |F|^{-n/4}$.

Proof. The following is proved in [MNT93].

Fact 8 Let $Y$ be a pairwise universal family of hash functions from $X$ to $Z$. Let $A \subseteq X$, $B \subseteq Y$, and $E \subseteq Z$. Then

$$\left| \mathrm{Prob}_{x \in A, h \in B}(h(x) \in E) - \frac{|E|}{|Z|} \right| \le \sqrt{\frac{|Y| \cdot |E|}{|A| \cdot |B| \cdot |Z|}}. \qquad (1)$$

$IP^F$ can be changed slightly to give a universal family, with $X = F^n$ and $Z = F$, by letting $h(x) = IP^F(x, y) + a$ for $y$ drawn randomly from $F^n$ and $a$ from $F$. Then the set of hash functions has size $|Y| = |F|^{n+1}$.

To bound the multicolor discrepancy of evaluating the hash family we can set $E$ to contain any single element of $F$. Hence for each rectangle $A \times B$ containing at least $|F|^{(3/2)n}$ entries the right hand side of inequality (1) is at most $|F|^{(n+1)/2} / \sqrt{|F|^{(3/2)n} \cdot |F|} = |F|^{-n/4}$. This is an upper bound on $1/\mu(A \times B)$ times the multicolor discrepancy, and hence also an upper bound on the latter itself. Smaller rectangles have measure, and hence multicolor discrepancy, less than $|F|^{(3/2)n} / |F|^{2n+1} = |F|^{-n/2-1}$, thus the multicolor discrepancy of evaluating the hash function is at most $|F|^{-n/4}$. Hence also $IP^F$ has small discrepancy: its communication matrix is a rectangle in the communication matrix for the hash evaluation. □
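
For tiny fields the multicolor discrepancy of the inner product can be computed exactly and compared with the quantity $|F|^{-n/4}$ from the proof above. The sketch assumes that multicolor discrepancy is read as the largest deviation $|\mu(R \cap M^{-1}(z)) - \mu(R)/|Z||$ over rectangles $R$ and colors $z$ under the uniform distribution, and it works over a prime field represented by integers modulo $p$. It enumerates only row sets: for a fixed row set and color, the best column set consists of all columns deviating in the same direction.

```python
from fractions import Fraction
from itertools import combinations, product


def ip_matrix(p, n):
    """Communication matrix of the inner product on GF(p)^n (p prime)."""
    vecs = list(product(range(p), repeat=n))
    return [[sum(a * b for a, b in zip(x, y)) % p for y in vecs] for x in vecs]


def mdisc_uniform(matrix, colors):
    """Exact multicolor discrepancy under the uniform distribution."""
    rows, cols = len(matrix), len(matrix[0])
    best = Fraction(0)
    for size in range(1, rows + 1):
        for a in combinations(range(rows), size):
            expected = Fraction(len(a), len(colors))
            for z in colors:
                # per-column deviation of the count of color z from |A|/|Z|
                devs = [sum(1 for x in a if matrix[x][y] == z) - expected
                        for y in range(cols)]
                pos = sum(d for d in devs if d > 0)
                neg = -sum(d for d in devs if d < 0)
                best = max(best, pos, neg)
    return Fraction(best) / (rows * cols)


if __name__ == "__main__":
    for n in (1, 2):
        value = mdisc_uniform(ip_matrix(3, n), range(3))
        print(n, value, float(value) <= 3 ** (-n / 4))
```

For $n = 1$ over $GF(3)$ the maximum is attained, for example, by the all-zero row with color 0, giving $2/9$.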

Proof of Corollary 2. We again make use of Fact 8. Assume that the output is encoded in binary in some standard way using $\lceil \log|Z| \rceil$ bits. Fix an arbitrary value of $k$ output bits to get a subset $E$ of possible outputs in $Z$. We would like to have $|E|/|Z| = 2^{-k}$, but this is not quite possible, e.g. for $Z$ being $\{0, \ldots, p - 1\}$ for some prime $p$. If we restrict ourselves to the lower $\log(|Z|)/2$ bits of the binary encoding of elements of $Z$, however, then each such bit is 1 resp. 0 with probability $1/2 \pm 1/\sqrt{|Z|}$ for a uniformly random $z \in Z$ even conditioned on other bits, so that the probability of a fixed value of $k$ of them is between $(1/2 - 1/\sqrt{|Z|})^k$ and $(1/2 + 1/\sqrt{|Z|})^k$. Then $\left| |E|/|Z| - 1/2^k \right| \le 2/\sqrt{|Z|}$.

Let $R = A \times B$ be any rectangle in the communication matrix. Assume that $|A| \cdot |B| \ge \sqrt{|X|} \cdot |Y|$. Then the right hand side of (1) is at most $\sqrt{|E| / (\sqrt{|X|} \cdot |Z|)} \le |X|^{-1/4}$. If $R$ is smaller, then its multicolor discrepancy is at most $1/\sqrt{|X|}$. So we can apply Corollary 3 with a multicolor discrepancy of at most $|X|^{-1/4} + 2|Z|^{-1/2}$. Note that the number of output bits we consider is $\log|Z|/2$, and we get $CS = \Omega(\log|X| \cdot \log|Z|)$ or $\Omega(\log^2|Z|)$, whichever is smaller. □



5 A Direct Product Result for the Rectangle Bound 

Theorem 2 is an immediate consequence of the following direct product result for the rectangle bound, plus a result of Razborov [Ra92].

Lemma 9 Let $f : X \times Y \to \{0, 1\}$ be a function and denote by $f_k$ the problem to compute $f$ on $k$ distinct instances. Assume that $bound(f) \ge b$ and that this is achieved on a balanced distribution $\nu$. Then there is a constant $\gamma > 0$ such that the average success probability of each classical protocol with communication $b/3$ for $f_k$ on $\nu^k$ is at most $2^{-\gamma k}$ for any $k \le b$.

The lemma is proved in Appendix C. Now we state the result of Razborov [Ra92].

Fact 10 $bound(DISJ) \ge \epsilon n$ for some constant $\epsilon > 0$.

Note that the distribution used in [Ra92] is not strictly balanced, but can be changed to such a distribution easily.
A Efficient Quantum Protocols and Small Multicolor Discrepancy

In this section we show Lemma 5. We first need the following fact from [Kla01]. This result is proved by first decomposing a quantum protocol for each of the possible values of all outputs into few rank one matrices, whose sum expresses the probability of this particular output on the inputs in the communication matrix. Then these matrices are discretized into rectangles (similar to [Yao93, Kre95]). Note that we can assume that the protocol always has a pure global state here, since we have dropped any space restrictions. Also the players do not share any entanglement at the beginning of the protocol, because this is destroyed by replacing the initial state of circuit slices by the totally mixed state on $S$ qubits (which in turn can be replaced by $S$ qubits from a pure state on $2S$ qubits).

Fact 11 Assume there is a quantum protocol with communication c that computes a value of the 

output O from a finite set Z on each input x,y with probability Px,y 

Then for each (3 > there is a real w G [0,1], and a set of 0(2^°'^//3^) rectangles R{i) with 
weights w(R{i)) G {—w,w}, so that 

w{R{l}) e\p:c,y- P,Px,y]- 

Hence for our protocol that computes the $k$ outputs of $f$ we can find $|Z|^k$ sets $M(b_1,\ldots,b_k)$ of
rectangles, so that for an input $x,y$ summing the weights of those rectangles in $M(b_1,\ldots,b_k)$ that
contain $x,y$ gives approximately the probability of output $(b_1,\ldots,b_k)$ occurring when the protocol
runs on $x,y$. For $R \in M(b_1,\ldots,b_k)$ we define $\ell(R) = (b_1,\ldots,b_k)$ as its label. Let $M$ be the union
of all the $M(b_1,\ldots,b_k)$.

Assume the protocol has an advantage of $\alpha$ over a random guess for every input. Set $\beta = \alpha/2$
when applying the above fact. For every output value $b_1,\ldots,b_k$ and for every input $x,y$ such
that $f(x,y) = (b_1,\ldots,b_k)$ we have

\[ \sum_{R\in M:\ \ell(R)=(b_1,\ldots,b_k),\ (x,y)\in R} w(R) \;-\; 1/|Z|^k \;\geq\; \alpha - \beta = \alpha/2, \]

and thus, defining $\delta(x,y,b_1,\ldots,b_k) = 1 - 1/|Z|^k$ if $f(x,y) = (b_1,\ldots,b_k)$ and $-1/|Z|^k$ otherwise,

\[ \sum_{R\in M:\ (x,y)\in R} w(R)\,\delta(x,y,\ell(R)) \;\geq\; \alpha/2, \]

since for all $x,y$: $\sum_{R\in M:\ (x,y)\in R} w(R) \leq 1$.
Hence, by averaging,

\[ \sum_{x,y} \mu(x,y) \sum_{R\in M:\ (x,y)\in R} w(R)\,\delta(x,y,\ell(R)) \;\geq\; \alpha/2, \]

and by exchanging sums

\[ \sum_{R\in M} w(R)\,\Big(\mu\big(R\cap f^{-1}(\ell(R))\big) - \mu(R)/|Z|^k\Big) \;\geq\; \alpha/2. \]

Consequently there must be a rectangle $R$ with

\[ \Big|\mu\big(R\cap f^{-1}(\ell(R))\big) - \mu(R)/|Z|^k\Big| \;\geq\; 1/O(2^{O(c)}/\beta^{O(1)}) = \Omega(\alpha^{O(1)}/2^{O(c)}). \]

This rectangle has the stated multicolor discrepancy.

B A Bipartite Product Result for Multicolor Discrepancy



In this section we prove the product lemma for multicolor discrepancy. Suppose that $\mathrm{mdisc}(f) \leq 1/2^d$. Let $O = \{(i_1,j_1),\ldots,(i_k,j_k)\}$
be the set of output labels, i.e., output $(i,j) \in O$ should correspond to $f(x_i,y_j)$. Fix some function
values $c_{i,j}$ for the elements of $O$, i.e., $f(x_i,y_j) = c_{i,j}$. We want to show that each rectangle in
$X^l \times Y^r$ is either very small or contains a fraction of inputs satisfying these constraints that is not
much larger than $1/|Z|^k$.
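To make the quantity concrete, the following minimal sketch computes a multicolor discrepancy by brute force for tiny communication matrices. It assumes the uniform distribution $\mu$ and takes the discrepancy of a single-output function to be the maximum over rectangles $R$ and colors $z$ of $|\mu(R \cap f^{-1}(z)) - \mu(R)/|Z||$, mirroring the expression bounded in Appendix A; the function names and the two example matrices are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction
from itertools import combinations

def nonempty_subsets(items):
    """Yield every non-empty subset of items as a tuple."""
    items = list(items)
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)

def multicolor_discrepancy(matrix, colors):
    """Brute-force max over rectangles A x B and colors z of
    |mu(R & f^{-1}(z)) - mu(R)/|Z|| under the uniform distribution (assumed)."""
    rows, cols = len(matrix), len(matrix[0])
    total = rows * cols
    best = Fraction(0)
    for A in nonempty_subsets(range(rows)):
        for B in nonempty_subsets(range(cols)):
            size = len(A) * len(B)
            for z in colors:
                hits = sum(1 for i in A for j in B if matrix[i][j] == z)
                dev = abs(Fraction(hits, total) - Fraction(size, total * len(colors)))
                best = max(best, dev)
    return best

# Hypothetical examples on X = Y = Z = {0, 1, 2}.
addition_mod3 = [[(x + y) % 3 for y in range(3)] for x in range(3)]
constant = [[0] * 3 for _ in range(3)]

print(multicolor_discrepancy(addition_mod3, [0, 1, 2]))
print(multicolor_discrepancy(constant, [0, 1, 2]))
```

The constant function is maximally unbalanced on the full rectangle, while addition modulo 3 colors every large rectangle almost evenly. The enumeration is exponential in $|X| + |Y|$, so it only serves as an illustration.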

Fix some rectangle $R \subseteq X^l \times Y^r$. The probability, when picking $x,y = x_1,\ldots,x_l,y_1,\ldots,y_r \in R$,
that $f(x_i,y_j) = c_{i,j}$ for all $(i,j) \in O$ can be written as the product of conditional probabilities

\[ \prod_{(i,j)\in O} \mathrm{Prob}_{x,y\in R}\big(f(x_i,y_j) = c_{i,j} \mid f(x_a,y_b) = c_{a,b} \text{ for } (a,b) < (i,j);\ (a,b)\in O\big), \qquad (2) \]

assuming some order $<$ on pairs. For any single term in this product there are three types of
conditions: $(a,b)$ may satisfy

1. $a \neq i$; $b \neq j$,

2. $a = i$; $b \neq j$,

3. $a \neq i$; $b = j$.

The first type of condition involves neither $x_i$ nor $y_j$, the others involve exactly one of them.
We can write a term of (2) as

\[ E_{u,v}\,\mathrm{Prob}_{x,y\in R}\big(f(x_i,y_j) = c_{i,j} \mid x_a = u_a,\ y_b = v_b \text{ for } (a,b) \neq (i,j), \text{ and } C\big), \qquad (3) \]

where the distribution on $u,v \in X^{l-1} \times Y^{r-1}$ is given by picking $u_1,\ldots,u_l,v_1,\ldots,v_r$ uniformly from
those inputs in $R$ that satisfy $f(u_a,v_b) = c_{a,b}$ for all $(a,b) < (i,j)$; $(a,b) \in O$, and then dropping
$u_i, v_j$. $C$ denotes the conditions of the second and third type.

For all $a,b$ with $a \neq i$, $b \neq j$ fix any value of $x_a = u_a$ and $y_b = v_b$, so that $f(x_a,y_b) = c_{a,b}$ if
$(a,b) < (i,j)$ and $(a,b) \in O$ (i.e., consider a term in the expectation (3)). Now we are only left with
the conditions of the other two types. Also we obtain a rectangle $R'[u,v]$ in $X \times Y$, since all inputs
but $x_i,y_j$ are fixed (as noted in an earlier proposition).

The second and third types of conditions involve either $x_i$ or $y_j$. Observe that each such
condition $f(x_i,y_b) = f(x_i,v_b) = c_{i,b}$ partitions $X$ into those $x_i$ satisfying it and those that do not,
and $f(x_a,y_j) = f(u_a,y_j) = c_{a,j}$ partitions $Y$. There are at most $k$ such conditions, so all $m$ possible
truth values of these conditions partition $X \times Y$ into disjoint rectangles $M_1,\ldots,M_m$ with $m \leq 2^k$.
We are interested in the measure of inputs with $f(x_i,y_j) = c_{i,j}$ among those $x_i,y_j$ satisfying the given
conditions, i.e., lying in one of these at most $2^k$ rectangles. Hence we need to bound

\[ \mathrm{Prob}_{x,y\in R'[u,v]}\big(f(x,y) = c_{i,j} \mid x,y \in M_\iota\big) \]

for one of these rectangles $M_\iota$.
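The partition by truth values can be sketched in a few lines. The example assumes a toy function (inner product modulo 2 on 3-bit strings) and hypothetical fixed inputs `v1`, `v2`, `u1`; conditions of the second type depend only on $x_i$, those of the third type only on $y_j$, so grouping $X$ and $Y$ by truth pattern yields at most $2^k$ disjoint rectangles covering $X \times Y$.

```python
from collections import defaultdict
from itertools import product

def truth_value_partition(X, Y, x_conditions, y_conditions):
    """Group X by the truth pattern of x_conditions and Y by that of
    y_conditions; each pair of groups is one rectangle of the partition."""
    x_classes = defaultdict(list)
    for x in X:
        x_classes[tuple(cond(x) for cond in x_conditions)].append(x)
    y_classes = defaultdict(list)
    for y in Y:
        y_classes[tuple(cond(y) for cond in y_conditions)].append(y)
    return [(xs, ys) for xs, ys in product(x_classes.values(), y_classes.values())]

# Toy function (an assumption for illustration): inner product mod 2.
def f(x, y):
    return sum(a & b for a, b in zip(x, y)) % 2

X = Y = list(product([0, 1], repeat=3))
v1, v2 = (1, 0, 0), (1, 1, 0)   # fixed y_b in conditions f(x_i, v_b) = c_{i,b}
u1 = (0, 1, 1)                  # fixed x_a in the condition f(u_a, y_j) = c_{a,j}
x_conditions = [lambda x: f(x, v1) == 1, lambda x: f(x, v2) == 0]
y_conditions = [lambda y: f(u1, y) == 1]

rectangles = truth_value_partition(X, Y, x_conditions, y_conditions)
k = len(x_conditions) + len(y_conditions)
print(len(rectangles), "rectangles, bound", 2 ** k)
```

In this instance all truth patterns occur, so the bound $m \leq 2^k$ is attained; in general some patterns are empty and $m$ is smaller.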

Suppose that $\mu(R'[u,v] \cap M_\iota) < 2^{-d/2}$. All such rectangles $R'[u,v] \cap M_\iota$ together contribute
at most $k \cdot 2^k \cdot 2^{-d/2}$ to the multicolor discrepancy of $R$, as we argue now. The size of $R$ can be
written as follows ($\mu$ is the uniform distribution on implicit domains).

\[ \mu(R) \;=\; E_{u,v}\,\mu(R'[u,v]) \;=\; E_{u,v} \sum_{1\leq \iota\leq m} \mu(R'[u,v] \cap M_\iota), \]

where the expectation $E_{u,v}$ is over the distribution in which $u_1,\ldots,u_l,v_1,\ldots,v_r$ are picked uniformly
from $R$, and then $u_i$ and $v_j$ are dropped.

Hence if we ignore all the small rectangles $R'[u,v] \cap M_\iota$ in $R$, the measure of ignored inputs is at
most $km2^{-d/2}$: for each of the $k$ terms in the expression (2) and each of the $m$ possible rectangles $M_\iota$ the above
expectation cannot gain more than measure $2^{-d/2}$ from these rectangles.

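So assume that $\mu(R'[u,v] \cap M_\iota) \geq 2^{-d/2}$ always. $R'[u,v] \cap M_\iota$ is a rectangle in $X \times Y$, and
hence has multicolor discrepancy at most $2^{-d}$. Recall that we are interested in

\[ \mathrm{Prob}_{x,y\in R'[u,v]}\big(f(x,y) = c_{i,j} \mid x,y \in M_\iota\big) = \mu\big(R'[u,v] \cap M_\iota \cap f^{-1}(c_{i,j})\big) \,/\, \mu\big(R'[u,v] \cap M_\iota\big), \]

and so with

\[ \mu\big(R'[u,v] \cap M_\iota \cap f^{-1}(c_{i,j})\big) - \mu\big(R'[u,v] \cap M_\iota\big)/|Z| \leq 2^{-d} \quad\text{and}\quad \mu\big(R'[u,v] \cap M_\iota\big) \geq 2^{-d/2} \]

we obtain

\[ \mathrm{Prob}_{x,y\in R'[u,v]}\big(f(x,y) = c_{i,j} \mid x,y \in M_\iota\big) \leq 1/|Z| + 2^{-d/2}. \]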

So, ignoring small rectangles that altogether contribute at most $k2^k2^{-d/2}$ multicolor discrepancy,
we have that each term in the product of probabilities (2) is at most $1/|Z| + 2^{-d/2}$, and hence the
product is at most $(1/|Z| + 2^{-d/2})^k \leq |Z|^{-k} + 2 \cdot 2^{-d/2}$, since $(1/|Z| + \gamma)^k \leq 1/|Z|^k + 2\gamma$ for all
$0 \leq \gamma \leq 1/2$ and $|Z| \geq 2$. So the multicolor discrepancy is at most $O(k2^k2^{-d/2})$, which is $2^{-\Omega(d)}$ as long as $k$ is at most a sufficiently small constant fraction of $d$.
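The elementary inequality used in the last step, $(1/|Z| + \gamma)^k \leq 1/|Z|^k + 2\gamma$ for $0 \leq \gamma \leq 1/2$ and $|Z| \geq 2$, can be checked exactly on a grid of parameters. The sketch uses rational arithmetic; the grid limits and function names are arbitrary choices made for illustration.

```python
from fractions import Fraction

def product_bound_holds(z_size, k, gamma):
    """Check (1/|Z| + gamma)^k <= 1/|Z|^k + 2*gamma exactly with rationals."""
    a = Fraction(1, z_size)
    return (a + gamma) ** k <= a ** k + 2 * gamma

def check_grid(max_z=6, max_k=12, steps=20):
    """Try |Z| in 2..max_z, k in 1..max_k, gamma in {0, 1/(2*steps), ..., 1/2}.
    Returns the first counterexample found, or None if there is none."""
    for z_size in range(2, max_z + 1):
        for k in range(1, max_k + 1):
            for t in range(steps + 1):
                gamma = Fraction(t, 2 * steps)
                if not product_bound_holds(z_size, k, gamma):
                    return (z_size, k, gamma)
    return None

print(check_grid())
```

The restriction on $\gamma$ matters: for $|Z| = 2$, $k = 3$ and $\gamma = 1$ the left side is $27/8$ while the right side is $17/8$. Within the stated range the bound follows from convexity of $\gamma \mapsto (1/|Z| + \gamma)^k$, which stays below the line $2\gamma$ because it does so at both endpoints $\gamma = 0$ and $\gamma = 1/2$.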



C A Direct Product Result for the One-Sided Rectangle Bound 

In this section we give the proof of Theorem 2. This theorem is an immediate consequence of our
direct product result for the rectangle bound (restated here), plus the aforementioned result of
Razborov [Ra92].

Lemma 12 Let $f : X \times Y \to \{0,1\}$ be a function and denote by $f_k$ the problem to compute $f$ on
$k$ distinct instances. Assume that $\mathrm{bound}(f) \geq b$ and that this is achieved on a balanced distribution
$\nu$.

Then there is a constant $\gamma > 0$ such that the average success probability of each classical protocol
with communication $b/3$ for $f_k$ on $\nu^k$ is at most $2^{-\gamma k}$ for any $k \leq b$.

Proof. Every randomized classical protocol with some success probability $p$ on a fixed distribution
can be replaced by a deterministic protocol with the same success probability using standard
techniques (since the success probability of a randomized protocol is an expectation over deterministic
protocols). So assume we are given a deterministic protocol with communication $c$ and success
probability $p$ for $f_k$. Such a protocol naturally leads to a partition of $X^k \times Y^k$ into $2^c$ rectangles
labelled with the common output of the protocol on the inputs in these rectangles. In our case
there are $2^{b/3}$ rectangles.

Since $\nu$ is (strictly) balanced, on $\nu^k$ each sequence of $k$ function values is equally likely. We
know that each rectangle in $X \times Y$ that has size at least $1/2^b$ cannot contain more than a fraction
of 3/4 of 1-inputs to $f$. Intuitively, for each sequence of $k$ outputs that has $\Omega(k)$ ones in it, the
correctness probability goes down by a factor of 3/4 with each 1-output and is hence exponentially
small in $k$.

Assume the following claim:

Claim 13 Each rectangle of size $\geq 2^{-b/2}$ in $X^k \times Y^k$ can contain at most a fraction of $2^{-\Omega(k)}$ of
inputs having $f_k(x,y) = c$, for every $c \in \{0,1\}^k$ with $|c| \in \{k/3,\ldots,k\}$.

Due to a simple application of the Chernoff bound all but a fraction of $1/2^{\Omega(k)}$ of the inputs
have function values $c$ with $|c| \geq k/3$. Then the overall correctness probability of the protocol on
$\nu^k$ is bounded from above by $1/2^{\Omega(k)}$, since apart from the inputs with fewer than $k/3$ ones in the
function value all other inputs lie in rectangles that either have an error of $1 - 1/2^{\Omega(k)}$ or are smaller
than $2^{-b/2}$. The latter rectangles have a combined measure of at most $2^{b/3} \cdot 2^{-b/2} \leq 2^{-\Omega(k)}$, given
that $k \leq b$.
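The Chernoff step can be illustrated numerically. Under the assumption stated in the proof that all $2^k$ output strings are equally likely, $|c|$ is binomial with parameters $k$ and $1/2$, and Hoeffding's inequality with deviation $k/6$ gives $\Pr[|c| < k/3] \leq e^{-k/18}$. The sketch compares the exact tail with this bound; the function names are illustrative.

```python
from math import comb, exp

def tail_fewer_than_third(k):
    """Exact probability that a uniform string c in {0,1}^k has |c| < k/3."""
    threshold = (k + 2) // 3          # smallest integer weight that is >= k/3
    return sum(comb(k, w) for w in range(threshold)) / 2 ** k

def hoeffding_bound(k):
    """Hoeffding: Pr[|c| <= k/2 - t] <= exp(-2 t^2 / k), here with t = k/6."""
    return exp(-k / 18)

for k in (3, 30, 90, 300):
    print(k, tail_fewer_than_third(k), hoeffding_bound(k))
```

Both columns decay exponentially in $k$, which is all the proof needs; the constant $1/18$ in the exponent is what this particular form of the bound yields and is not claimed to match the constant hidden in the $\Omega(k)$ of the text.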

Let us prove the claim. Consider a rectangle $R \subseteq X^k \times Y^k$ of size at least $2^{-b/2}$. Fix any output string
$c \in \{0,1\}^k$ with at least $k/3$ ones. We are interested in the measure of inputs in $R$ that have $c$ as
function value. Again we may write the measure as a product of conditional probabilities

\[ \prod_{i} \mathrm{Prob}\big(f(x_i,y_i) = c_i \mid f(x_j,y_j) = c_j \text{ for } j < i\big). \qquad (4) \]

We skip all terms concerning the probability that $f(x_i,y_i) = 0$. So we are interested in the
probability that $f(x_i,y_i) = 1$ conditioned on $f(x_j,y_j) = c_j$ for all $j < i$.
Each term may be written as follows.

\[ E_{u,v}\,\mathrm{Prob}_{x,y\in R}\big(f(x_i,y_i) = 1 \mid x_j = u_j,\ y_j = v_j \text{ for } j \neq i\big), \qquad (5) \]

where the distribution on $u,v \in X^{k-1} \times Y^{k-1}$ is given by picking $u_1,\ldots,u_k,v_1,\ldots,v_k$ uniformly
from those inputs in $R$ that satisfy $f(u_j,v_j) = c_j$ for all $j < i$, and then dropping $u_i,v_i$.

Again fix all $x_j,y_j$, $j \neq i$, in any way under the condition that $f(x_j,y_j) = c_j$ for all $j < i$. This
leaves us with a rectangle $R'[u,v] \subseteq X \times Y$.

If $\nu(R'[u,v]) < 2^{-b}$, then we ignore $R'[u,v]$, since all such $R'[u,v]$ together will not influence
the error of $R$ significantly. More precisely, since all rectangles $R'[u,v]$ obtained by fixing $u,v$ are
disjoint parts of $R$ when extended by $u,v$ to rectangles in $X^k \times Y^k$, the combined size of all these
small rectangles in $X^k \times Y^k$ is at most $2^{-b}$. All rectangles ignored in this way in any of the at
most $k$ product terms in (4) together have weight at most $k2^{-b}$, which is at most a $k2^{-b}/2^{-b/2}$
contribution relative to $R$.

So assume that $\nu(R'[u,v]) \geq 2^{-b}$ always. Then clearly $R'[u,v]$ contains at most a fraction of 3/4
of 1-inputs to $f$, by the definition of $\mathrm{bound}(f)$. All inputs with output $c$ need to have a 1 as output on
block $i$. Hence the fraction of inputs with output $c$ in $R$ is at most $(3/4)^{k/3} + k \cdot 2^{-b/2} = 2^{-\Omega(k)}$,
since $k \leq b$. □
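The decomposition into conditional probabilities used in the proof of the claim is the chain rule, and it can be verified on a small instance. The sketch assumes one-bit instances with $f$ the AND of the two bits, a hypothetical rectangle $R = A \times B$ in $X^3 \times Y^3$, and the uniform distribution on $R$; all concrete values are made up for illustration.

```python
from fractions import Fraction

def AND(x_bit, y_bit):
    return x_bit & y_bit

def chain_rule_product(R, f, c):
    """Product over i of Prob_{(x,y) in R}[f(x_i,y_i)=c_i | f(x_j,y_j)=c_j, j<i]."""
    survivors = list(R)
    result = Fraction(1)
    for i, c_i in enumerate(c):
        nxt = [(x, y) for x, y in survivors if f(x[i], y[i]) == c_i]
        if not nxt:
            return Fraction(0)
        result *= Fraction(len(nxt), len(survivors))   # one conditional term
        survivors = nxt
    return result

def direct_fraction(R, f, c):
    """Fraction of inputs in R whose k outputs equal the string c."""
    hits = sum(1 for x, y in R if all(f(x[i], y[i]) == c[i] for i in range(len(c))))
    return Fraction(hits, len(R))

# Hypothetical rectangle R = A x B with X = Y = {0, 1} and k = 3 instances.
A = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
B = [(1, 0, 1), (1, 1, 1), (0, 1, 1)]
R = [(x, y) for x in A for y in B]
c = (1, 0, 1)

print(chain_rule_product(R, AND, c), direct_fraction(R, AND, c))
```

The proof bounds each factor belonging to a 1-output by 3/4 (up to the ignored small rectangles) and simply drops the factors belonging to 0-outputs, which are at most 1.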



D Classical Communication-Space Tradeoffs

Proof of Theorem 3. First we prove the lower bound for matrix multiplication. We may
assume that $S \leq \gamma\sqrt{\epsilon n}/2$, for the constants from Theorem 2, because the outputs are included
in the communication and so otherwise we immediately have $CS^2 = \Omega(n^3)$. Consider circuit slices
of a communicating circuit for matrix multiplication, each circuit slice containing communication
$\gamma\epsilon n/(2S)$. Let $\ell$ denote the number of outputs in any slice. If $\ell < 2S/\gamma$ we will be able to get the
lower bound easily. So assume there are more outputs, and choose any $k = 2S/\gamma$ of them. Then we
will apply Theorem 2 (details follow later) to show that $k$ outputs can be computed only with success
probability $2^{-\gamma k}$, and hence $(1/2) \cdot 1/2^S \leq 2^{-\gamma k}$ by the proposition on circuit slices. This leads to the contradiction
that $S + 1 \geq 2S$, hence the slice makes only $2S/\gamma$ outputs, and so $C \cdot (\gamma\epsilon n)^{-1} \cdot 2S \cdot 2S/\gamma \geq n^2$, since all $n^2$ outputs have to be produced. This gives $CS^2 = \Omega(n^3)$.

Consider a classical protocol with $\gamma\epsilon n/(2S) = \epsilon n/k$ bits of communication. We partition the
universe $\{1,\ldots,n\}$ of the Disjointness problems to be computed into $k$ mutually disjoint subsets
$U(i,j)$ of size $n/k$, each associated to an output $(i,j)$, which in turn corresponds to a row/column
pair $A[i], B[j]$ in the input matrices $A$ and $B$. Assume that there are $a$ outputs $(i,j_1),\ldots,(i,j_a)$
involving $A[i]$. Each output is associated to a subset of the universe $U(i,j_t)$, and we set $A[i]$ to
zero on all positions that are not in one of these subsets. Then we proceed analogously with the
columns of $B$.

If the protocol computes on these restricted inputs, it has to solve $k$ instances of Disjointness of
size $n/k$ each, since $A[i]$ and $B[j]$ contain a single block of size $n/k$ in which both are not set to 0 if
and only if $(i,j)$ is one of the $k$ outputs. Hence Theorem 2 is applicable. Note that $k = 2S/\gamma \leq \sqrt{\epsilon n}$
and hence $k \leq \epsilon n/k$ as required.
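The restriction of the inputs can be sketched as follows. The code assumes 0-based indices, $k$ distinct chosen outputs, and $k$ dividing $n$; instance $t$ is given by two subsets of offsets within its own block of the universe. The helper names and the toy instance are illustrative assumptions. Entry $(i,j)$ of the Boolean product is then 1 exactly when the two sets attached to output $(i,j)$ intersect, i.e., when that Disjointness instance is not disjoint.

```python
def embed_disjointness(n, outputs, alice_sets, bob_sets):
    """Build n x n Boolean matrices A, B in which instance t occupies its own
    block of the universe {0, ..., n-1}; all other positions stay zero."""
    k = len(outputs)
    block = n // k
    A = [[0] * n for _ in range(n)]
    B = [[0] * n for _ in range(n)]
    for t, (i, j) in enumerate(outputs):
        for offset in range(block):
            pos = t * block + offset            # position inside block U(i, j)
            A[i][pos] = int(offset in alice_sets[t])
            B[pos][j] = int(offset in bob_sets[t])
    return A, B

def boolean_product_entry(A, B, i, j):
    """Entry (i, j) of the Boolean matrix product: OR over p of A[i][p] AND B[p][j]."""
    return int(any(A[i][p] and B[p][j] for p in range(len(A))))

# Toy instance (hypothetical): n = 6, k = 3 outputs, blocks of size 2.
outputs = [(0, 1), (0, 2), (3, 1)]
alice_sets = [{0}, {1}, {0, 1}]
bob_sets = [{1}, {1}, {0}]

A, B = embed_disjointness(6, outputs, alice_sets, bob_sets)
entries = [boolean_product_entry(A, B, i, j) for i, j in outputs]
print(entries)
```

Rows and columns may be shared between outputs, as for row 0 and column 1 here; the instances still do not interfere because a row and a column are both nonzero only inside the block of the output they have in common.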

The proof for matrix-vector product is analogous. □ 






